
    Authors' Writing Styles Based Authorship Identification System Using the Text Representation Vector

    © 2019 IEEE. Text mining is one of the main and typical tasks of machine learning (ML). Authorship identification (AI) is a standard research subject in text mining and natural language processing (NLP) that has undergone a remarkable evolution in recent years. The task is to determine the actual author of an anonymous text on the basis of a set of writing samples. Standard text classification often relies on many handcrafted features such as dictionaries, knowledge bases, and various stylometric characteristics, which frequently leads to very high dimensionality. Unlike traditional approaches, this paper proposes an authorship identification approach based on automatic feature engineering using word2vec word embeddings, taking each author's writing style into account. The system comprises two learning stages: the first generates the semantic representation of each author by using word2vec to learn and extract the most relevant characteristics from the raw documents; the second applies a multilayer perceptron (MLP) classifier that fixes the classification rules via the backpropagation learning algorithm. Experiments show that the MLP classifier with the word2vec model achieves an accuracy of 95.83% on an English corpus, suggesting that the word2vec word embedding model can clearly improve identification accuracy compared to classical models such as n-gram frequencies and bag of words.
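    The two-stage pipeline described above can be sketched as follows. This is a minimal, hedged illustration, not the authors' implementation: a random matrix stands in for the embeddings that gensim's Word2Vec would learn from the raw documents, the vocabulary and the two "authors" are toy data, and the MLP is a one-hidden-layer network trained with plain backpropagation.

```python
import numpy as np

# Toy stand-in for a trained word2vec table (assumption: in the real system
# this is learned from the authors' raw documents, e.g. with gensim).
rng = np.random.default_rng(0)
vocab = {w: i for i, w in enumerate("the cat sat mat dog ran fast home".split())}
E = rng.normal(size=(len(vocab), 8))  # 8-dim word vectors, for illustration only

def doc_vector(tokens):
    """Stage 1: represent a document as the mean of its word vectors."""
    idx = [vocab[t] for t in tokens if t in vocab]
    return E[idx].mean(axis=0)

docs = ["the cat sat", "cat sat mat", "dog ran fast", "the dog ran home"]
X = np.stack([doc_vector(d.split()) for d in docs])
y = np.array([0, 0, 1, 1])  # two hypothetical candidate authors

# Stage 2: one-hidden-layer MLP trained with backpropagation.
W1 = rng.normal(scale=0.5, size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 2)); b2 = np.zeros(2)
for _ in range(500):
    h = np.tanh(X @ W1 + b1)
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    g = (p - np.eye(2)[y]) / len(y)      # d(cross-entropy)/d(logits)
    gW2, gb2 = h.T @ g, g.sum(0)
    gh = (g @ W2.T) * (1 - h ** 2)       # backprop through tanh
    gW1, gb1 = X.T @ gh, gh.sum(0)
    W1 -= 0.5 * gW1; b1 -= 0.5 * gb1
    W2 -= 0.5 * gW2; b2 -= 0.5 * gb2

pred = np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)
```

    Averaging word vectors is the simplest way to compose a fixed-size document representation; the paper's larger embedding dimension and corpus would follow the same shape.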

    Multi source retinal fundus image classification using convolution neural networks fusion and Gabor-based texture representation

    Glaucoma is one of the best-known irreversible chronic eye diseases; it leads to permanent blindness, but it can be treated if diagnosed early. Convolutional neural networks (CNNs), a branch of deep learning, have an impressive record in image analysis and interpretation, including medical imaging, thanks to their capacity to extract pertinent features automatically from the original image. Moreover, ensemble learning algorithms can significantly improve the classification rate. In this paper, a two-stage image processing and ensemble learning approach is proposed for automated glaucoma diagnosis. In the first stage, different modalities are generated from the original images by applying advanced image processing techniques, in particular Gabor filter-based texture images. Each dataset constructed from a given modality is then learned by an individual CNN classifier. Aggregation techniques are finally applied to produce the final decision from the outputs of all CNN classifiers. Experiments were carried out on the RIM-ONE dataset for glaucoma diagnosis. The obtained results demonstrated the superiority of the proposed ensemble learning system over existing studies, with a classification accuracy of 89.63%.
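    The two stages can be sketched in a few lines of numpy. This is a hedged illustration under stated assumptions: the Gabor filter parameters are illustrative (the abstract does not list the paper's settings), a random array stands in for a fundus photograph, and each per-modality CNN is mocked by a fixed probability vector so that only modality generation and aggregation are shown.

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=8.0, gamma=0.5):
    """2-D Gabor kernel; all parameter values here are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            * np.cos(2 * np.pi * xr / lam))

def conv_same(img, k):
    """'Same'-size 2-D convolution via FFT, to keep the sketch numpy-only."""
    H, W = img.shape; kh, kw = k.shape
    s = (H + kh - 1, W + kw - 1)
    out = np.fft.irfft2(np.fft.rfft2(img, s=s) * np.fft.rfft2(k, s=s), s=s)
    return out[kh // 2:kh // 2 + H, kw // 2:kw // 2 + W]

# Stage 1: one texture modality per Gabor orientation (random stand-in image).
rng = np.random.default_rng(1)
img = rng.random((64, 64))
modalities = [conv_same(img, gabor_kernel(theta=t))
              for t in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)]

# Stage 2 stand-in: mocked per-modality CNN outputs, then two common
# aggregation rules (hard majority vote and soft probability averaging).
cnn_probs = np.array([[0.7, 0.3], [0.6, 0.4], [0.2, 0.8], [0.9, 0.1]])
hard_vote = np.bincount(cnn_probs.argmax(axis=1), minlength=2).argmax()
soft_vote = cnn_probs.mean(axis=0).argmax()
```

    In a full system, each element of `modalities` would become a training set for its own CNN; the aggregation step is independent of how the individual classifiers were trained.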

    Multi-modal classifier fusion with feature cooperation for glaucoma diagnosis

    Background: Glaucoma is a major public health problem that can lead to an optic nerve lesion, requiring systematic screening in the population over 45 years of age. The diagnosis and classification of this disease have developed markedly in recent years, particularly in the machine learning domain. Multimodal data have been shown to be a significant aid to machine learning, especially by improving data-driven decision-making. Method: Solving classification problems with combinations of classifiers increases robustness as well as classification reliability by exploiting the complementarity that may exist between the classifiers; complementarity is considered a key property of multimodality. Convolutional Neural Networks (CNNs), which learn useful features from raw data by themselves, work very well in pattern recognition and have shown superior performance, especially in image classification. This article proposes a multimodal classification approach based on deep Convolutional Neural Network and Support Vector Machine (SVM) classifiers using multimodal data and features for glaucoma diagnosis from retinal fundus images of the RIM-ONE dataset. Handcrafted feature descriptors such as the Gray Level Co-Occurrence Matrix, Central Moments, and Hu Moments cooperate with the features automatically generated by the CNN in order to properly detect the optic nerve and consequently obtain a better classification rate, allowing a more reliable diagnosis of glaucoma. Results: The experimental results confirm that combining classifiers with the BWWV technique is better than learning the classifiers separately. The proposed method provides a computerized diagnosis system for glaucoma with impressive results compared to the main related studies, encouraging further work along this research path.
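    The "feature cooperation" idea, handcrafted descriptors concatenated with learned CNN features, can be sketched as follows. This is a minimal numpy illustration, not the paper's code: the GLCM uses a single pixel offset, only three Haralick-style statistics and three central moments are computed (the full Hu invariants are omitted), and a random vector stands in for the CNN's learned features.

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Normalized Gray Level Co-Occurrence Matrix for one pixel offset
    (minimal version; a full system would pool several offsets)."""
    q = np.minimum((img * levels).astype(int), levels - 1)
    a = q[:q.shape[0] - dy, :q.shape[1] - dx].ravel()
    b = q[dy:, dx:].ravel()
    M = np.zeros((levels, levels))
    np.add.at(M, (a, b), 1)          # count co-occurring gray-level pairs
    return M / M.sum()

def glcm_features(M):
    """Classic texture statistics derived from a co-occurrence matrix."""
    i, j = np.indices(M.shape)
    return np.array([(M * (i - j) ** 2).sum(),            # contrast
                     (M ** 2).sum(),                      # energy
                     (M / (1 + np.abs(i - j))).sum()])    # homogeneity

def central_moment(img, p, q):
    """Central moment mu_pq, the building block of Hu's invariants."""
    y, x = np.indices(img.shape)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00
    return ((x - xc) ** p * (y - yc) ** q * img).sum()

rng = np.random.default_rng(2)
img = rng.random((32, 32))           # stand-in for a fundus image patch
handcrafted = np.concatenate([glcm_features(glcm(img)),
                              [central_moment(img, 2, 0),
                               central_moment(img, 0, 2),
                               central_moment(img, 1, 1)]])
cnn_features = rng.normal(size=16)   # stand-in for the CNN's learned features
fused = np.concatenate([handcrafted, cnn_features])  # feature cooperation
```

    The fused vector would then feed an SVM (or another classifier); concatenation is the simplest cooperation scheme, and the paper's BWWV combination operates at the decision level on top of such classifiers.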

    Multi-classifier system for authorship verification task using word embeddings

    © 2018 IEEE. Authorship verification is a topic of growing research interest that has developed rapidly in recent years. The task is to decide whether an unknown document belongs to the set of documents known to be written by a given author. Classical text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases, and special tree kernels; other studies use n-gram features, which often lead to the curse of dimensionality. Contrary to traditional approaches, this article proposes a machine learning scheme based on the fusion of three different architectures, namely Convolutional Neural Network, Recurrent-Convolutional Neural Network, and Support Vector Machine classifiers, without human-designed features. Word2vec-based word embeddings are used to learn the best word representations for automatic authorship verification: they provide semantic vectors that capture the most relevant information in the raw text with a relatively small dimensionality. Moreover, the classifiers generally make different errors on the same samples, so combining their points of view retains the relevant information captured by each. The final decision of the system is obtained by combining the results of the three models using the voting method.
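    The final voting step can be shown in isolation. This is a hedged stand-in: the three classifiers' outputs below are fixed binary predictions (1 = "same author") rather than real CNN, RCNN, and SVM outputs, so only the 2-of-3 majority rule is demonstrated.

```python
import numpy as np

# Mocked predictions of the three fused models on five test documents
# (1 = document attributed to the known author, 0 = not attributed).
preds = np.array([
    [1, 0, 1, 1, 0],   # stand-in for the CNN
    [1, 1, 1, 0, 0],   # stand-in for the RCNN
    [0, 1, 1, 1, 0],   # stand-in for the SVM
])

# Majority vote: a document is accepted when at least 2 of 3 models agree.
final = (preds.sum(axis=0) >= 2).astype(int)
```

    Because the three architectures tend to make different errors on the same samples, the majority vote can be correct even where any single model is wrong, which is the complementarity argument the abstract makes.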